12 research outputs found

    Semantic Tagging on Historical Maps

    Tags assigned by users to shared content can be ambiguous. As a possible solution, we propose semantic tagging as a collaborative process in which a user selects and associates Web resources drawn from a knowledge context. We applied this general technique in the specific context of online historical maps and allowed users to annotate and tag them. To study the effects of semantic tagging on tag production, the types and categories of obtained tags, and user task load, we conducted an in-lab within-subject experiment with 24 participants who annotated and tagged two distinct maps. We found that the semantic tagging implementation does not affect these measures, while linking tags to well-defined concept definitions. Compared to label-based tagging, our technique also gathers positive and negative tagging relationships. We believe that our findings carry implications for designers who want to adopt semantic tagging in other contexts and systems on the Web.
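    As a rough illustration of what such a semantic tag could look like as a data structure, the sketch below links a hypothetical map annotation to Web resources and records the polarity of each tagging relationship; the class names, identifiers, and example DBpedia URIs are illustrative and not taken from the paper's implementation.

    ```python
    from dataclasses import dataclass
    from enum import Enum


    class Relation(Enum):
        """Polarity of the tagging relationship between an annotation and a concept."""
        POSITIVE = "positive"   # the annotated region is related to the concept
        NEGATIVE = "negative"   # the annotated region is explicitly not related to it


    @dataclass
    class SemanticTag:
        """A tag that points to a well-defined Web resource instead of a free-text label."""
        annotation_id: str   # identifier of the user's map annotation (hypothetical)
        concept_uri: str     # e.g. a DBpedia URI selected from the knowledge context
        concept_label: str   # human-readable label shown to the user
        relation: Relation   # positive or negative tagging relationship


    # Example: the user confirms that an annotated region shows Vienna and
    # explicitly rejects a wrongly suggested concept.
    tags = [
        SemanticTag("anno-42", "http://dbpedia.org/resource/Vienna", "Vienna", Relation.POSITIVE),
        SemanticTag("anno-42", "http://dbpedia.org/resource/Venice", "Venice", Relation.NEGATIVE),
    ]

    for tag in tags:
        print(f"{tag.annotation_id} -> {tag.concept_label} ({tag.relation.value})")
    ```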

    Power Reduction Opportunities on End-User Devices in Quality-Steady Video Streaming

    This paper uses a crowdsourced dataset of online video streaming sessions to investigate opportunities to reduce power consumption while taking QoE into account. We base our work on prior studies that model both the end user's QoE and the end-user device's power consumption with the help of high-level video features such as the bitrate, the frame rate, and the resolution. Going beyond existing research, which focused on reducing power consumption at the same QoE by optimizing video parameters, we investigate potential power savings by other means, such as using a different playback device, a different codec, or a predefined maximum quality level. Based on the power consumption of the streaming sessions from the crowdsourced dataset, we find that devices could save more than 55% of power if all participants adhered to low-power settings.
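    The following sketch illustrates the general idea of estimating device power from high-level video features and comparing a session against a low-power configuration. The linear model form, all coefficients, and the example settings are placeholders invented for illustration; they are not the models or measurements used in the paper.

    ```python
    def estimate_power_mw(bitrate_kbps: float, framerate_fps: float, height_px: int,
                          base_mw: float = 900.0) -> float:
        """Toy device power estimate (milliwatts) from high-level video features."""
        return (base_mw
                + 0.05 * bitrate_kbps    # network and decoding cost grows with bitrate
                + 4.0 * framerate_fps    # more frames per second means more decoding work
                + 0.3 * height_px)       # higher resolutions cost more to decode and display


    def relative_saving(current: dict, low_power: dict) -> float:
        """Fraction of power saved when switching to the low-power settings."""
        return 1.0 - estimate_power_mw(**low_power) / estimate_power_mw(**current)


    session   = {"bitrate_kbps": 8000, "framerate_fps": 60, "height_px": 1080}
    low_power = {"bitrate_kbps": 2500, "framerate_fps": 30, "height_px": 720}
    print(f"estimated saving: {relative_saving(session, low_power):.0%}")
    ```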

    Old maps and open data networks

    Old maps are a record of the past, exposing features people might want to tell stories about. Maphub is a Web application that enables them to do so by creating annotations on digitized high-resolution historical maps. By semantically tagging regions on the map, users create associations between their annotations and resources in open Web-based data networks. These associations are leveraged to enable multilingual search and to generate overlays of historical maps on modern mapping applications. Contributed annotations are shared on the Web following the W3C Open Annotation specification. Preliminary studies show general user satisfaction with our approach.
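    For concreteness, the sketch below builds one such map annotation carrying a semantic tag. The paper follows the W3C Open Annotation specification; this example uses the JSON-LD style of its successor, the W3C Web Annotation Data Model, and all identifiers except the DBpedia resource and the @context URL are placeholders.

    ```python
    import json

    # Minimal sketch of a semantic tag on a map region; not Maphub's actual serialization.
    annotation = {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        # semantic tag: a resource in an open Web-based data network
        "body": "http://dbpedia.org/resource/Vienna",
        "target": {
            "source": "http://example.org/maps/historical-map-123.jp2",  # placeholder map image
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": "xywh=1200,800,400,300",  # rectangular region on the digitized map
            },
        },
    }

    print(json.dumps(annotation, indent=2))
    ```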

    Challenges of future multimedia QoE monitoring for internet service providers

    Ever-increasing network traffic and rising user expectations at reduced cost make the delivery of high Quality of Experience (QoE) for multimedia services more vital than ever in the eyes of Internet Service Providers (ISPs). Real-time, user-centric quality monitoring has become essential as the first step in the cost-effective provisioning of high-quality services. With the recent changes in the perception of user privacy, the rising level of application-layer encryption, and the introduction and deployment of virtualized networks, QoE monitoring solutions need to be adapted to a fast-changing Internet landscape. In this contribution, we provide an overview of state-of-the-art quality monitoring models and probing technologies, and highlight the major challenges ISPs face when they want to ensure high service quality for their customers.
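    As a toy illustration of the kind of lightweight QoS-to-QoE mapping a monitoring probe might apply per streaming session, the sketch below maps simple stalling statistics to a MOS estimate. The exponential shape mirrors the general trend reported in the QoE literature (quality drops steeply with the first impairments), but the function and its coefficients are arbitrary placeholders, not a standardized or validated monitoring model.

    ```python
    import math

    def estimate_mos(stalling_events: int, total_stall_s: float) -> float:
        """Map simple per-session stalling statistics to a MOS estimate on a 1..5 scale."""
        degradation = 0.6 * stalling_events + 0.15 * total_stall_s   # arbitrary placeholder weights
        return 1.0 + 4.0 * math.exp(-degradation)   # 5.0 without impairments, approaching 1.0 as they grow

    for events, seconds in [(0, 0.0), (1, 2.0), (3, 10.0)]:
        print(f"{events} stalls, {seconds:4.1f}s total -> MOS estimate {estimate_mos(events, seconds):.2f}")
    ```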

    Observer confidence in subjective quality evaluation

    How much can we trust our data? How much confidence can we put into the ratings of our observers, and how confident were the observers themselves when rating? No computer-generated estimate can substitute for subjective tests with human observers when it comes to evaluating the Quality of Experience (QoE) of today's multimedia services. Automated Quality of Service (QoS) measurements that take into account factors such as the bitrate, packet loss, or signal-to-noise ratio may give an estimate of the resulting quality for the end user, but QoS-based methods have proven inaccurate at predicting the experienced quality and offer only a rough estimate. QoE experiments are therefore conducted to provide ground-truth data for models that predict QoE on the basis of QoS data. To generate accurate models, one needs to know how accurate the acquired data itself is. Documents by the ITU, such as ITU-R BT.500-13 or ITU-T Rec. P.910, describe how subjective multimedia quality experiments have to be conducted. They also include procedures for data analysis and reporting, and specify when test persons have to be removed from the pool because their ratings deviate too strongly from those of the other participants. The ratings acquired from viewers during experiment sessions are typically simply averaged into the "Mean Opinion Score" (MOS), the average score all observers assigned to a stimulus. The MOS does not take into account inter-personal differences or the fact that observers may not have been sure about their own ratings. MOS values are often presented along with their 95% confidence intervals (CI). The CI is a good indicator of the spread of the ratings, but it only shows how certain one can be that the measured MOS approximates the "true" MOS. To better understand the causes of (dis)agreement between observers, a new rating methodology that focuses on their confidence is evaluated over the course of seven multimedia quality experiment series, conducted at the University of Vienna and the Institut de Recherche en Communications et Cybernétique de Nantes in France. Focusing on the confidence of observers shows that the perceived quality may depend not only on the stimulus itself but also on outside factors such as the test situation or the observer's personality. Even the scale used for assigning quality values can influence how confident observers feel during a session. Moreover, with emerging multimedia services such as 3D television and cinema, one cannot assume that observers have previous experience with the technology, which may lower the confidence they put in their votes. In this thesis, we address several questions, including whether observer confidence can be measured effectively during experiments, which personal factors influence rating behavior, and how observer confidence in turn influences the quality ratings. In our experiments, we also take into account personality traits and hidden measurements such as the reaction times of observers. We show that rating behavior differs considerably from person to person. We propose new reporting and data analysis methods and formulate recommendations for the conduct of QoE experiments that allow much deeper insight into the acquired data.
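    The sketch below shows the standard MOS and 95% confidence interval computation the thesis starts from, together with a confidence-weighted mean as one conceivable way to fold self-reported observer confidence into the analysis; the weighting scheme and the example data are illustrative assumptions, not the method proposed in the thesis.

    ```python
    from math import sqrt
    from statistics import mean, stdev

    ratings    = [4, 5, 3, 4, 4, 2, 5]                  # ACR votes (1..5) for one stimulus
    confidence = [0.9, 1.0, 0.4, 0.8, 0.7, 0.3, 1.0]    # self-reported confidence in [0, 1]

    mos = mean(ratings)
    # 95% CI via the normal approximation (1.96); small samples would use Student's t.
    ci95 = 1.96 * stdev(ratings) / sqrt(len(ratings))

    # One conceivable confidence-aware summary: votes given with low confidence count less.
    weighted_mos = sum(r * c for r, c in zip(ratings, confidence)) / sum(confidence)

    print(f"MOS          = {mos:.2f} +/- {ci95:.2f}")
    print(f"weighted MOS = {weighted_mos:.2f}")
    ```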

    On the experimental biases in user behavior and QoE assessment in the lab

    User behavior is one of the key components of customer engagement and abandonment, which result from a good or bad Quality of Experience. However, methods to evoke and measure user behavior are still understudied. This paper presents an in-depth look at a study in which we measured user behavior during video streaming consumption in a controlled laboratory environment. We confronted subjects with typical streaming problems such as stalling and quality fluctuations. The subjects were not informed about the real purpose of the test, and their behavior was tracked unobtrusively. The results suggest that the method can elicit responses to the inserted problems, such as seeking, pausing, or reloading the web page. However, a third of the subjects acted apprehensively, changing their behavior because they were part of a test. In this contribution, we elaborate on the underlying reasons for these experimental biases, discuss the suitability of different test designs for behavioral assessment, and give guidelines on how to quantify and combat biasing factors introduced by the test procedure.
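    As a sketch of how such unobtrusively logged interactions could be related to the inserted impairments, the example below counts user actions that follow each impairment within a short window. The log format, event names, and the 10-second window are assumptions made for illustration, not the instrumentation used in the study.

    ```python
    # Hypothetical data: timestamps (s) of inserted impairments and unobtrusively
    # logged user actions during one streaming session.
    impairments = [35.0, 120.0, 240.0]
    events = [
        (38.2, "seek"),
        (41.0, "pause"),
        (244.5, "reload"),
        (300.0, "seek"),
    ]

    REACTION_WINDOW_S = 10.0   # assumed window in which an action counts as a reaction

    def reactions_per_impairment(impairments, events, window=REACTION_WINDOW_S):
        """Collect logged user actions that follow each impairment within `window` seconds."""
        return {t: [name for ts, name in events if t <= ts <= t + window] for t in impairments}

    for t, reactions in reactions_per_impairment(impairments, events).items():
        print(f"impairment at {t:5.1f}s -> reactions: {reactions or 'none'}")
    ```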

    Comparing fixed and variable segment durations for adaptive video streaming: a holistic analysis

    HTTP Adaptive Streaming (HAS) is the de facto standard for video delivery over the Internet. It enables dynamic adaptation of video quality by splitting a video into small segments and providing multiple quality levels per segment. So far, HAS services have typically used a fixed segment duration. This reduces encoding and streaming variability and thus allows faster encoding of the video content and a lower prediction complexity for adaptive bitrate algorithms. However, the content-agnostic placement of I-frames at the beginning of each segment introduces additional encoding overhead. To mitigate this overhead, variable segment durations, which take encoder-placed I-frames into account, have recently been proposed. Fewer I-frames are then needed, which lowers the video bitrate without quality degradation. While several proposals exploiting variable segment durations exist, no comparative study has yet highlighted the impact of this technique on coding efficiency and adaptive streaming performance. This paper conducts such a holistic comparison within the adaptive video streaming ecosystem. First, it provides a broad investigation of video encoding efficiency for variable segment durations. Second, a measurement study evaluates the impact of segment duration variability on the performance of HAS using three adaptation heuristics and the dash.js reference implementation. Our results show that variable segment durations increased the Quality of Experience for 54% of the evaluated streaming sessions, while reducing the overall bitrate by 7% on average.
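    The sketch below contrasts the two segmentation strategies: fixed-length segments, whose boundaries force extra I-frames, and variable segment durations, whose boundaries are aligned with encoder-placed I-frames. The keyframe positions, duration bounds, and segment length are invented for illustration and are not taken from the paper's measurement setup.

    ```python
    keyframes_s = [0.0, 3.7, 9.1, 13.0, 19.6, 24.2]   # encoder-placed I-frames, e.g. at scene cuts
    duration_s = 30.0

    def fixed_segments(duration, seg_len=4.0):
        """Fixed-length segments: every boundary forces an I-frame, whether one exists there or not."""
        bounds, t = [], 0.0
        while t < duration:
            bounds.append(t)
            t += seg_len
        return bounds

    def variable_segments(keyframes, min_len=2.0, max_len=8.0):
        """Place boundaries only on encoder-placed I-frames while keeping durations within bounds.

        Handling of gaps longer than max_len, which would still force an extra I-frame, is omitted."""
        bounds = [keyframes[0]]
        for k in keyframes[1:]:
            if min_len <= k - bounds[-1] <= max_len:
                bounds.append(k)
        return bounds

    fixed = fixed_segments(duration_s)
    variable = variable_segments(keyframes_s)
    extra_iframes = [b for b in fixed if b not in keyframes_s]   # boundaries needing a forced I-frame

    print("fixed boundaries   :", fixed)
    print("variable boundaries:", variable)
    print("extra I-frames forced by fixed segmentation:", len(extra_iframes))
    ```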